Update `HomeserverTestCase.get_success(...)` and friends to drive async Rust (Tokio runtime/thread pool) by MadLittleMods · Pull Request #19871 · element-hq/synapse

MadLittleMods · 2026-06-19T22:36:01Z

Update HomeserverTestCase.get_success(...) and friends to drive async Rust (Tokio runtime/thread pool)

Spawned from adding some more async Rust things in #19846 and noticing that we already have an existing pattern to use instead of the custom `till_deferred_has_result(...)` that has crept into a few files.

Alternative to #19867 spurred on by this comment from @erikjohnston

Does this slow down the entire test suite?

| Job | Before | After |
| --- | --- | --- |
| `trial (3.10, sqlite, all)` | 7m - 8m 35s | TODO |
| `trial (3.10, postgres, 14, all)` | 19m 53s | TODO |

Dev notes

#19394 (comment) and #19734 (comment) discuss why you sometimes need to call `self.reactor.advance(0)` before you can actually `self.reactor.advance(...)`, and the reasoning for why `pump(...)` may have become a thing.

Todo

Remove till_deferred_has_result
Remove wait_on_thread

Pull Request Checklist

Pull request is based on the develop branch
Pull request includes a changelog file. The entry should:
- Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
- Use markdown where necessary, mostly for code blocks.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
Code style is correct (run the linters)

MadLittleMods · 2026-06-20T01:31:07Z

                event.room_version,
            ),
            exc=LimitExceededError,
-            by=0.5,


In a lot of cases, the `by` usage didn't seem necessary at all: the test still passes without advancing time in the reactor/clock.

MadLittleMods · 2026-06-20T01:37:23Z

        # whole chain to completion.
        self.reactor.pump([by] * 100)

-    def get_success(self, d: Awaitable[TV], by: float = 0.0) -> TV:


Removed the `by` arg as it encourages bad behavior (people use it as a hammer, advancing time without reasoning just to make things work) and we arbitrarily advance time by 100x this amount (imprecise).

I've instead updated the few places that used it to call a precise `self.reactor.advance(...)` as necessary.
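To make the imprecision concrete, here is a minimal sketch using a toy clock. `ToyClock` and its method names are hypothetical stand-ins written for this illustration, not Synapse's or Twisted's classes. It mirrors the removed `pump([by] * 100)` call next to a single precise advance.

```python
import heapq
from typing import Callable, List, Tuple


class ToyClock:
    """Hypothetical fake clock, only for illustrating `pump` vs `advance`."""

    def __init__(self) -> None:
        self.now = 0.0
        self._counter = 0
        self._calls: List[Tuple[float, int, Callable[[], None]]] = []

    def call_later(self, delay: float, fn: Callable[[], None]) -> None:
        # The counter keeps heap ordering stable for calls due at the same time.
        self._counter += 1
        heapq.heappush(self._calls, (self.now + delay, self._counter, fn))

    def advance(self, amount: float) -> None:
        # Move fake time forward and run everything that is now due.
        self.now += amount
        while self._calls and self._calls[0][0] <= self.now:
            _, _, fn = heapq.heappop(self._calls)
            fn()

    def pump(self, timings: List[float]) -> None:
        # Advance once per entry, like the removed `pump([by] * 100)` call.
        for amount in timings:
            self.advance(amount)


# Old style: `by=0.5` is pumped 100 times, so 50 fake seconds elapse.
fired_pumped: List[str] = []
pumped = ToyClock()
pumped.call_later(0.5, lambda: fired_pumped.append("intended"))
pumped.call_later(30.0, lambda: fired_pumped.append("unrelated timeout"))
pumped.pump([0.5] * 100)

# New style: advance by exactly the amount the test has reasoned about.
fired_precise: List[str] = []
precise = ToyClock()
precise.call_later(0.5, lambda: fired_precise.append("intended"))
precise.call_later(30.0, lambda: fired_precise.append("unrelated timeout"))
precise.advance(0.5)
```

In this toy setup, the pumped clock also fires the unrelated 30-second call, while the precise advance fires only the call the test intended.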

MadLittleMods · 2026-06-20T01:54:21Z

+        sync_d = ensureDeferred(
            worker_presence_handler.user_syncing(
                self.user_id, self.device_id, True, PresenceState.ONLINE
-            ),
-            by=0.1,
+            )
        )
+        # `user_syncing` proxies the presence write to the main process over an HTTP
+        # replication request. The request body is streamed by a `Cooperator` that uses
+        # the clock to schedule each chunk at a tiny *non-zero* delay (`_EPSILON`), so
+        # we need to actually advance the clock for it to fire.
+        self.reactor.advance(Duration(microseconds=1).as_secs())
+        self.get_success(sync_d)


This is the main pattern I'm recommending if you need to advance time by a non-zero increment. `ensureDeferred` works well, but the name doesn't make it obvious that we want the task to run in the background on its own.

`run_in_background(...)` would also work but its usage is a bit awkward. I guess we could use `run_coroutine_in_background(...)` instead 🤔

The difference between ensureDeferred(...) vs run_in_background(...)/run_coroutine_in_background(...) is all of the extra LoggingContext (log context) handling. It doesn't matter for tests though.

MadLittleMods · 2026-06-20T02:15:34Z

+        # XXX: There can be a few already dispatched database queries (from normal
+        # background tasks in Synapse) and the threadless `ThreadPool` that we use in
+        # tests uses *untracked* clock calls to pass database results back so `shutdown`
+        # doesn't cancel those calls. This is a quirk of our test infrastructure
+        # (threadless `ThreadPool`) so this kind of "hack" is fine.
+        self.reactor.advance(0)


The explanation is slightly hand-wavy.

MadLittleMods · 2026-06-20T02:30:40Z

        # Process the leave and join in one go.
        dir_handler.update_user_directory = True
        dir_handler.notify_new_event()
-        self.wait_for_background_updates()


As far as I can tell, `self.wait_for_background_updates()` is totally bogus here. I assume the mistake was made because `notify_new_event(...)` uses `run_as_background_process(...)`, but that's a totally separate thing (background updates != background processes).

This made the test work only because `wait_for_background_updates(...)` did a `get_success(..., by=0.1)`, which pumped and advanced the reactor/clock.

But we can replace it with something more precise.

MadLittleMods · 2026-06-20T03:24:49Z

+            # reactor to run (like `reactor.callFromThread(...)`)
+            self.reactor.advance(0)
+
+    def get_success(


The primary change of this PR is updating `get_success(...)`/`get_failure(...)` so they can make progress on any awaitable that needs to do async Rust work.

The rest is just adjusting things because we removed the by arg (see other discussion) and stopped calling pump(...).

MadLittleMods · 2026-06-20T03:28:31Z

+    # FIXME: Remove as this has the exact same semantics as `get_success()`. In
+    # https://github.com/matrix-org/synapse/pull/8402#discussion_r495992506 where it was
+    # introduced, it was claimed that "get_success fails the test if the deferred fails
+    # rather than raising, which I find a bit unintuitive." but `get_success()` actually
+    # does raise "@raise SynchronousTestCase.failureException : If the
+    # L{Deferred<twisted.internet.defer.Deferred>} has no result or has a failure
+    # result." at-least in today's world.


I think this is accurate (follow-up PR)

MadLittleMods · 2026-06-20T03:30:10Z

-        duration_ms = 10
-        await self.clock.sleep(Duration(milliseconds=count * duration_ms))


Instead of making the sleep duration depend on the count (dynamic), I've just made it static so we can be precise with our time advancements below.

MadLittleMods · 2026-06-20T03:34:40Z

            for callbable, args, kwargs in triggers:
                callbable(*args, **kwargs)

-    def till_deferred_has_result(


We can remove till_deferred_has_result because get_success(...) covers it on its own now

MadLittleMods · 2026-06-20T04:09:04Z

+        # Checking `d.called` by itself is not sufficient by itself as this is possible:
+        #
+        # If you have a first `Deferred` `D1`, you can add a callback which returns
+        # another `Deferred` `D2`, and `D2` must then complete before any further
+        # callbacks on `D1` will execute (and later callbacks on `D1` get the *result*
+        # of `D2` rather than `D2` itself).
+        #
+        # So, `D1` might have `called=True` (as in, it has started running its
+        # callbacks), but any new callbacks added to `D1` won't get run until `D2`
+        # completes. Fortunately, we can detect this by checking `d.paused`.
+        while not d.called or d.paused:


This language is the same explanation given in f22e7cd

You can reproduce the problem with this test: SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.databases.main.test_events_worker.GetEventCancellationTestCase.test_first_get_event_cancelled

MadLittleMods added 3 commits June 19, 2026 17:23

Refactor get_success(...) to allow other threads to make progress

520a4bc

Refactor get_failure

65a1c59

Fix get_failure(...) raises docstring

fdeed9a

MadLittleMods added rust Z-Rust labels Jun 19, 2026

Add changelog

5ca9050

MadLittleMods changed the title ~~Update get_success(...) and friends to drive async Rust (Tokio runtime/thread pool)~~ Update HomeserverTestCase.get_success(...) and friends to drive async Rust (Tokio runtime/thread pool) Jun 19, 2026

MadLittleMods added 10 commits June 19, 2026 17:39

Fix get_failure lint

6e9b2a2

Reduce timeout so you don't have to wait as long when something goes …

c45774c

…wrong

Fix test cases that don't need by=

ae7e367

Fix tests/storage/test_background_update.py

f54d0c0

Fix tests/app/test_homeserver_shutdown.py

997a160

Fix tests/handlers/test_presence.py

b501ad1

Fix tests/handlers/test_send_email.py

9cfd0f9

Explain why remove get_success_or_raise

66a515b

Extract logic to _wait_for_deferred

a1092da

Fix FIXME comment grammar

09c91d3

MadLittleMods added 3 commits June 19, 2026 20:37

Use 1 second timeout default

4357aa4

Use "deferred" in _wait_for_deferred docstring

edce488

Add example if you need to advance time

5cc4590

MadLittleMods added 2 commits June 19, 2026 21:06

Fix `tests.handlers.test_profile.ProfileTestCase.test_background_upda…

5b27102

…te_room_membership_resume_after_restart`

No need to change background update in `tests/app/test_homeserver_shu…

44253df

…tdown.py`

Fix `tests.handlers.test_user_directory.UserDirectoryTestCase.test_pr…

47297af

…ocess_join_after_server_leaves_room` `wait_for_background_updates` is not relevant

MadLittleMods added 2 commits June 19, 2026 21:45

Fix tests/handlers/test_typing.py

cc2c27b

Fix trial tests.replication.test_federation_ack

26dc512

MadLittleMods added 2 commits June 19, 2026 22:19

Fix lints

2bce6e7

Merge branch 'develop' into madlittlemods/better-get-success

3425d15

Remove other till_deferred_has_result

ecce873

MadLittleMods added 2 commits June 19, 2026 22:43

Explain better

2c51142

Fix `tests.storage.databases.main.test_events_worker.GetEventCancella…

999d22d

…tionTestCase.test_first_get_event_cancelled` Based on the same fix made in f22e7cd (f22e7cd)

MadLittleMods added 3 commits June 19, 2026 23:17

Remove wait_on_thread

41642be

Fix trial-olddeps: `builtins.TypeError: 'type' object is not subscr…

350b15f

…iptable`

Fix `tests.replication.tcp.test_handler.ChannelsTestCase.test_wait_fo…

60ddfc6

…r_stream_position_rdata`

MadLittleMods wants to merge 30 commits into develop from madlittlemods/better-get-success
